DETAILED ACTION
Response to Arguments 

1. Applicant's arguments with respect to the claims have been considered but are
moot in view of the new ground(s) of rejection. 

Claim Rejections - 35 USC § 102 

2. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 122(b), by
another filed in the United States before the invention by the applicant for patent or (2) a patent 
granted on an application for patent by another filed in the United States before the invention by the 
applicant for patent, except that an international application filed under the treaty defined in section 
351(a) shall have the effects for purposes of this subsection of an application filed in the United States 
only if the international application designated the United States and was published under Article 21(2) 
of such treaty in the English language. 

3. Claims 1, 3-6, 14-19, 28-30, 32, 33, and 35-39 are rejected under 35
U.S.C. 102(e) as being anticipated by Ramaswamy et al. (U.S. Patent No. 6,188,976
filed Oct. 23, 1998).

As per claims 1, 18, 19, and 28, Ramaswamy et al. discloses a method of using
a tuning set of information to jointly optimize the performance and size of a language 
model, comprising: 

segmenting at least a subset of received textual corpus into segments by 
clustering every N-items of the received corpus into a training unit (C.6.lines 15-18), 
wherein resultant training units are separated by gaps (C.6.line 67, C.7.lines 1, 2-the
separate classes inherently include gaps, C.6.lines 15-18-Fig. 5-item 40'-the separate
sub-corpora inherently include gaps);
and wherein N is an empirically derived value based, at least in part, on the size
of the received corpus (C.3.lines 50-63-The number n can either be a predetermined
fixed number or a number that dynamically varies with each language model building
iteration. For example n may C.6.lines 13-40-his linguistic units as N-items in each
subcorpora).

creating the tuning set from application-specific information (C.2.lines 44-48-his
restricted corpora, C.5.lines 40-44-the application is speech recognition);

(a) training a seed model via the tuning set (C.6.lines 21-25-his initial reference 
language model as the seed model); 

(b) calculating a similarity within a sequence of the training units on either side of 
each of the gaps (C.6.lines 34-36-his relevance score calculator of "each unit" inherently 
includes "on either side of the gap"); 

(c) selecting segment boundaries that maximize intra-segment similarity and
inter-segment disparity (C.6.lines 36-41-his threshold comparator and appropriate
sub-corpora);

(d) calculating a perplexity value for each segment based on a comparison with 
the seed model (C.6.lines 28-34); 

(e) selecting some of the segments based on their respective perplexity values to
augment the tuning set (C.6.lines 30-33-his stored linguistic units); 

iteratively refining the tuning set and the seed model by repeating steps (a) 
through (e) until a threshold (C.6.lines 44-63-"further language building iterations if 
quality is deemed unsatisfactory"-interpreted to include the above steps as explained);
and 

refining the language model based on the seed model (C.6.lines 44-63-his new 
reference language model); 
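
For illustration only, the loop recited in steps (a), (d), and (e) can be pictured with a minimal Python sketch. It is not taken from Ramaswamy et al. or from the claims: the unigram model with add-one smoothing, the function names (train_unigram, perplexity, segment_corpus, refine_tuning_set), and the parameters n, keep, and max_iters are all assumptions made here, and steps (b) and (c) (similarity across gaps and boundary selection) are omitted.

```python
import math
from collections import Counter


def train_unigram(tokens):
    """Step (a): train a seed model; assumed here to be a unigram model with add-one smoothing."""
    counts = Counter(tokens)
    total = len(tokens)
    vocab = len(counts) + 1  # one extra slot for unseen words

    def prob(word):
        return (counts[word] + 1) / (total + vocab)

    return prob


def perplexity(prob, segment):
    """Step (d): perplexity of a non-empty segment under the seed model."""
    log_sum = sum(math.log(prob(w)) for w in segment)
    return math.exp(-log_sum / len(segment))


def segment_corpus(tokens, n):
    """Cluster every n items of the corpus into one training unit."""
    return [tokens[i:i + n] for i in range(0, len(tokens), n)]


def refine_tuning_set(tuning, corpus, n=4, keep=1, max_iters=3):
    """Repeat train / score / select until segments run out or max_iters is reached."""
    tuning = list(tuning)
    segments = segment_corpus(corpus, n)
    for _ in range(max_iters):
        if not segments:
            break
        seed = train_unigram(tuning)
        # Rank remaining segments from lowest to highest perplexity.
        segments.sort(key=lambda s: perplexity(seed, s))
        selected, segments = segments[:keep], segments[keep:]
        for seg in selected:
            tuning.extend(seg)  # step (e): augment the tuning set
    return train_unigram(tuning), tuning
```

In this sketch the stopping threshold is simply an iteration cap; a quality-based criterion could be substituted without changing the structure of the loop.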

As per claim 3, Ramaswamy et al. discloses all of the limitations of claim 1, upon
which claim 3 depends. Ramaswamy et al. further discloses: 

the tuning set of information is comprised of one or more application-specific 
documents (C.7.lines 6-9-the application is e-mail, the documents comprise "show me
the next e-mail...").

As per claim 4, Ramaswamy et al. discloses all of the limitations of claim 1, upon
which claim 4 depends. Ramaswamy et al. further discloses: 

the tuning set of information is a highly accurate set of textual information 
linguistically relevant to (C.2.lines 55-62), but not taken from, the received textual 
corpus (C.3.lines 14-18, the received corpus-external corpus comprises many domains, 
however the seed corpus is linguistically related, but not taken from the external 
corpus). 

As per claim 5, Ramaswamy et al. discloses all of the limitations of claim 1, upon
which claim 5 depends. Ramaswamy et al. further discloses: 

a training set comprised of at least the subset of the received textual corpus 
(C.3.lines 6-8, 14-17-test corpus is at least the subset of the received textual corpus).

As per claim 6, Ramaswamy et al. discloses all of the limitations of claim 5, upon 
which claim 6 depends. Ramaswamy et al. further discloses: 
ranking the segments of the training set based, at least in part, on the calculated
perplexity value for each segment (C.4.lines 36-41, C.6.lines 25-34, C.8.lines 34-36).

As per claim 14, Ramaswamy et al. discloses all of the limitations of claim 1,
upon which claim 14 depends. Ramaswamy et al. further discloses: 

the perplexity value is a measure of the predictive power of a certain language 
model to a segment of the received corpus (C.4. lines 16-21). 
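
For reference, perplexity in this sense is conventionally defined as follows. The notation is introduced here for illustration and does not come from the cited patent: S is a segment of m items w_1 through w_m, and P is the probability that the language model assigns.

```latex
\mathrm{PP}(S) = P(w_1, \ldots, w_m)^{-1/m}
             = \exp\!\left( -\frac{1}{m} \sum_{i=1}^{m} \ln P(w_i \mid w_1, \ldots, w_{i-1}) \right)
```

A lower value means the model predicts the segment better, which is why low-perplexity segments are treated as the more relevant ones.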

As per claim 15, Ramaswamy et al. discloses all of the limitations of claim 1,
upon which claim 15 depends. Ramaswamy et al. further discloses: 

ranking the segments of at least the subset of the received corpus based, at least 
in part, on the calculated perplexity value of each segment (C.4.lines 36-40, C.6. lines 
25-34, C.8.lines 34, 35); and 

updating the tuning set of information with one or more of the segments from at 
least the subset of the received corpus (C.4.lines 41-47, C.6.lines 37-43).

As per claim 16, Ramaswamy et al. discloses all of the limitations of claim 15,
upon which claim 16 depends. Ramaswamy et al. further discloses: 

one or more of the segments with the lowest perplexity value from at least the 
subset of the received corpus are added to the tuning set (C.4.lines 41-47-"...below the
perplexity threshold...", C.6.lines 37-43). 

As per claim 17, Ramaswamy et al. discloses all of the limitations of claim 1,
upon which claim 17 depends. Ramaswamy et al. further discloses: 

utilizing the refined language model in an application (C.5.lines 40-42, the
application is speech recognition) to predict a likelihood of another corpus (C.5.lines 42-
45-the likelihood is interpreted as the "accuracy... for the current language model"-the
other corpus is the test corpus).

As per claim 28, claim 28 sets forth limitations similar to claim 1, and is thus
rejected for the same reasons. Ramaswamy further teaches:

refine the seed model with one or more segments of the received corpus based, 
at least in part, on the calculated perplexity values (C.4.lines 36-41-his reference model
as the seed model); 

iteratively refine the tuning set with segments ranked by the seed model 
(C.4.lines 36-40, C.6.lines 25-34, C.8.lines 34, 35) and in turn iteratively update the
seed model via the refined tuning set (C.4.lines 8, 9, 36-47-his "added to relevant
corpus", his each time... as the iterations); 

filter the received corpus via the seed model to find low-perplexity segments (see 
claim 16); and 

train the language model via the low-perplexity segments (C.6. lines 44-63-his 
new reference language model); 

As per claim 29, Ramaswamy et al. discloses all of the limitations of claim 28, 
upon which claim 29 depends. Ramaswamy et al. further discloses: 

the tuning set is dynamically selected as relevant to the received corpus 
(C.3.lines 47-54). 

As per claim 30, Ramaswamy et al. discloses all of the limitations of claim 28, 
upon which claim 30 depends. Ramaswamy et al. further discloses: 
a dynamic lexicon generation function, to develop an initial lexicon from
the tuning set (C.3.lines 42-44-the tuning set (seed corpus) is used to develop an initial 
lexicon (corpus)), and to update the lexicon with the select segments from the received 
corpus (C.3.lines 50-55- "...adding linguistic units to relevant corpus"-the relevant 
corpus being the updated lexicon). 

As per claim 32, Ramaswamy et al. discloses all of the limitations of claim 28, 
upon which claim 32 depends. Ramaswamy et al. further discloses:

a dynamic segmentation function (C.5.lines 1-3), to iteratively segment the 
received corpus (C.5.lines 1-3) to improve a predictive performance attribute of the 
modeling agent (C.5.lines 6-9-"to improve language model quality..." comprising 
evaluating perplexity change which is interpreted as the predictive performance). 

As per claim 33, Ramaswamy et al. discloses all of the limitations of claim 32, 
upon which claim 33 depends. Ramaswamy et al. further discloses:

the dynamic segmentation function iteratively re-segments the received corpus 
until the language model reaches an acceptable threshold (C.5.lines 1, 2, 9-15-the
external corpus is segmented, iteratively by extracting linguistic units, until the language
model is updated once "...a certain number...", a threshold, is reached).

As per claim 35, Ramaswamy et al. discloses all of the limitations of claim 34, 
upon which claim 35 depends. Ramaswamy et al. further discloses: 

the data structure generator removes segments from the data structure that do 
not meet a minimum frequency threshold (C.4.lines 29-31-it is well known that the
relevancy of the segments is based in part on frequency), and dynamically re-segments
the received corpus to improve predictive capability while reducing the size of the data
structure (C.5.lines 1-3, C.5.lines 6-9-"to improve language model quality..." comprising
evaluating perplexity change which is interpreted as the predictive performance). 

As per claim 36, Ramaswamy et al. discloses a method of jointly optimizing the 
performance and size of a language model comprising: 

segmenting one or more relatively large language corpora into multiple segments 
of N items, wherein N is an empirically derived value based, at least in part, on the size 
of the received corpus (C.3.lines 50-63-"The number n can either be a predetermined 
fixed number or a number that dynamically varies with each language model building 
iteration. For example n may Fig. 5 item 40', and 41.s.1, 41.s.2...41.s.N, and
C.6.lines 15-41-his linguistic units as N-items in each subcorpora).

selecting an initial tuning sample of application-specific data (see claim 1), the 
initial tuning sample being relatively small in comparison to the one or more relatively 
large language corpora (C.2. lines 51, 52, C.2.lines 44-63), wherein the initial tuning 
sample is used for training a seed model (see claim 1-tuning set discussion), the seed
model to be used for ranking the multiple segments from the language corpora (see 
claim 28); 

iteratively training the seed model to obtain a mature seed model, wherein the 
iterative training proceeds until a threshold is reached (see claim 1-each iteration 
interpreted to be more mature than the first), each iteration of the training including (see 
claim 1-threshold discussion):

updating the seed model according to the tuning sample (see claim 1-tuning
sample as the tuning set); 

ranking each of the multiple segments according to a perplexity comparison with 
the seed model (see claim 28); 

selecting some of the multiple segments that possess a low perplexity; and 

augmenting the tuning sample with the selected segments (see claims 1 and 28); 

once the threshold is reached, filtering the language corpora according to the 
mature seed model to select low-perplexity segments (see claim 16); 

combining data from the low-perplexity segments (see claim 16-adding 
discussion); and 

training the language model according to the combined data (see claim 28-train 
...low perplexity discussion). 

As per claim 37, claim 37 sets forth limitations similar to claim 3 and is thus 
rejected for the same reasons and under the same rationale. 

As per claim 38, Ramaswamy teaches claim 36, and further discloses:

wherein the threshold comprises one of a predetermined size of the seed model
or a sufficient application specificity of the seed model (C.6.lines 44-50-his sufficient 
number). 

As per claim 39, Ramaswamy teaches claim 36, and further discloses:

pruning the language model utilizing an entropy based cutoff algorithm that uses
only information embedded in the language model itself (C.5.lines 6-21-his language
model quality criteria and quality determination as the algorithm, his linguistic units as
the embedded information). 

Claim Rejections - 35 USC § 103 

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed 
or described as set forth in section 102 of this title, if the differences between the 
subject matter sought to be patented and the prior art are such that the subject 
matter as a whole would have been obvious at the time the invention was made 
to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was 
made. 

5. Claims 10-13, 31 and 34 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Ramaswamy et al. (U.S. Patent No. 6,188,976 filed Oct. 23, 1998) in 
view of Bangalore et al. (U.S. Patent No. 6,317,707 filed Dec. 7, 1998).

Ramaswamy et al. and Bangalore et al. are analogous art in that they both deal 
with language modeling. 

As per claim 10, Ramaswamy et al. and Bangalore et al. disclose all of the 
limitations of claim 1, upon which claim 10 depends. Ramaswamy et al. does not 
disclose: 

the calculation of the similarity within a sequence of training units defines a 
cohesion score. 

However, Bangalore et al. teaches the calculation of the similarity within a 
sequence of training units (C.3.lines 22, 23) defines a cohesion score (C.3.lines 15-19-
"very close relationship" is interpreted as the cohesion). Therefore, at the time of the
invention, it would have been obvious to one ordinarily skilled in the art to combine
Ramaswamy et al. with Bangalore et al. The motivation for doing so would have been to 
determine how close or similar the training units were to each other for the benefit of 
maximizing the clustering process of related items (C.4. lines 12, 13). 

As per claim 11, Ramaswamy et al. and Bangalore et al. disclose all of the 
limitations of claim 10, upon which claim 11 depends. Ramaswamy et al. does not
disclose: 

intra-segment similarity is measured by the cohesion score. 

However, Bangalore et al. teaches intra-segment similarity is measured by the 
cohesion score (C.3.lines 15-19, 22, 23). Therefore, at the time of the invention, it would 
have been obvious to one ordinarily skilled in the art to combine Ramaswamy et al. with 
Bangalore et al. The motivation for doing so would have been to measure how close or 
similar the intra-segment training units were to each other for the benefit of maximizing 
the clustering process of related items (C.3.lines 17-19, C.4.lines 13, 14), to better 
improve subsequent language modeling results. 

As per claim 12, Ramaswamy et al. and Bangalore et al. disclose all of the 
limitations of claim 10, upon which claim 12 depends. Ramaswamy et al. does not 
disclose: 

inter-segment disparity is approximated from the cohesion score. 

However, Bangalore et al. teaches inter-segment disparity (C.3.lines 24, 25-the different
vector coordinates interpreted as inter-segments) is approximated from the cohesion score
(C.4.lines 35-45, Table 2-the "Compactness Value"-determines the score and cohesion
and the "Class Index"-determines the inter-segment disparity resulting from the
cohesion score). Therefore, at the time of the invention, it would have been obvious to 
one ordinarily skilled in the art to combine Ramaswamy et al. with Bangalore et al. The 
motivation for doing so would have been to determine how disparate or distinct the 
inter-segment training units were to each other for the benefit of maximizing the 
clustering process of related items (C.3.lines 15-19, C.4.lines 14, 15), to better improve
subsequent language modeling results. 

As per claim 13, Ramaswamy et al. and Bangalore et al. disclose all of the 
limitations of claim 1, upon which claim 13 depends. Ramaswamy et al. does not
disclose: 

the calculation of inter-segment disparity defines a depth score. 

However, Bangalore et al. teaches the calculation of inter-segment disparity 
defines a depth score (C.4.lines 12-16, 30-66-Table 2-the depth of the inter-segment
disparity approximated from the cohesion score-compactness value-is indicated as the
values are "deeper" as they are farther down the list). Therefore, at the time of the
invention, it would have been obvious to one ordinarily skilled in the art to combine
Ramaswamy et al. with Bangalore et al. The motivation for doing so would have been to 
determine the depth of the disparity in a ranked manner to visually determine the 
relatedness of different classes or inter-segment disparity by index (C.4.Table 2-visual 
depth benefit). 
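
To make the terms used for claims 10-13 concrete, the following Python sketch shows one common way a cohesion score and a depth score can be computed over a sequence of training units. It is an assumption offered for illustration and is not drawn from the claims, from Ramaswamy et al., or from Bangalore et al.: cohesion is taken to be the cosine similarity of the word counts on either side of each gap, depth is taken to be the drop in cohesion relative to the highest value on each side, and the function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bags of words."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def cohesion_scores(units):
    """Cohesion at each gap: similarity of the units on either side of it."""
    return [cosine(units[i], units[i + 1]) for i in range(len(units) - 1)]


def depth_scores(cohesion):
    """Depth at each gap: how far cohesion drops below the highest value on each side."""
    depths = []
    for i, score in enumerate(cohesion):
        left = max(cohesion[: i + 1])
        right = max(cohesion[i:])
        depths.append((left - score) + (right - score))
    return depths
```

Under these assumptions, a gap with a large depth score is a candidate segment boundary, since the units within each neighboring segment are similar to one another while the two segments differ.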

As per claim 31, Ramaswamy et al. discloses all of the limitations of claim 28, 
upon which claim 31 depends. Ramaswamy et al. does not disclose: 
a frequency analysis function, to determine a frequency of occurrence of
segments within the received corpus. 

However, Bangalore et al. teaches having a function based upon frequencies for 
each input word, which determines the frequencies of segments within the received 
corpus (C.3.lines 45-47). Therefore, at the time of the invention, it would have been
obvious to one ordinarily skilled in the art to combine Ramaswamy et al. with Bangalore 
et al. The motivation for doing so would have been to assist in building a cluster in the 
well known method of having a vector space to hold the clusters with the frequency of 
each segment being incorporated into the cluster for the benefit of maximizing the 
clustering segments (C.3. lines 18-20, 62, 63, C.4. lines 13, 14), to better improve 
subsequent language modeling results. 

As per claim 34, Ramaswamy et al. discloses all of the limitations of claim 32,
upon which claim 34 depends. Ramaswamy et al. does not disclose: 

a frequency analysis function, to determine a frequency of occurrence of 
segments within the received corpus. 

However, Bangalore et al. teaches having a function based upon frequencies for 
each input word, which determines the frequencies of segments within the received 
corpus (C.2.lines 59, 60). Therefore, at the time of the invention, it would have been 
obvious to one ordinarily skilled in the art to combine Ramaswamy et al. with Bangalore 
et al. The motivation for doing so would have been to assist in building a cluster in the 
well known method of having a vector space to hold the clusters with the frequency of 
each segment being incorporated into the cluster for the benefit of maximizing the 
clustering segments (C.3.lines 18-20, 62, 63, C.4.lines 13, 14), to better improve
subsequent language modeling results. 



6. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Lamont M. Spooner whose telephone number is 
571/272-7613. The examiner can normally be reached on 8:00 AM - 5:00 PM. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil can be reached on 571/272-7602. The fax phone number 
for the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 